Abstract - Solid-state drives (SSDs) have been widely deployed
in desktops and data centers. However, SSDs suffer
from bit errors, and the bit error rate is time dependent 
since it increases as an SSD wears down. Traditional storage 
systems mainly use parity-based RAID to provide reliability 
guarantees by striping redundancy across multiple devices, but 
the effectiveness of RAID in SSDs remains debatable as parity 
updates aggravate the wearing and bit error rates of SSDs. 
In particular, an open problem is how different parity
distributions over multiple devices, such as the even distribu- 
tion suggested by conventional wisdom, or uneven distributions 
proposed in recent RAID schemes for SSDs, may influence the 
reliability of an SSD RAID array. To address this fundamental 
problem, we propose the first analytical model to quantify the 
reliability dynamics of an SSD RAID array. Specifically, we 
develop a "non-homogeneous" continuous time Markov chain 
model, and derive the transient reliability solution. We validate 
our model via trace-driven simulations and conduct numerical 
analysis to provide insights into the reliability dynamics of SSD 
RAID arrays under different parity distributions and subject 
to different bit error rates and array configurations. Designers 
can use our model to decide the appropriate parity distribution 
based on their reliability requirements. 

Keywords - Solid-state Drives; RAID; Reliability; CTMC;
Transient Analysis 

I. Introduction

Solid-state drives (SSDs) are emerging as the next-generation
storage medium. Today's SSDs are mostly built on NAND flash
memory, and offer several advantages over
hard disks, including higher I/O performance, lower energy
consumption, and higher shock resistance. As SSD prices continue
to drop, SSDs have been widely deployed
in desktops and large-scale data centers [10], [14].

However, even though enterprise SSDs generally provide 
high reliability guarantees (e.g., with mean-time-between- 
failures of 2 million hours ITTI), they are susceptible to 
wear-outs and bit errors. First, SSDs regularly perform erase 
operations between writes, yet they can only tolerate a 
limited number of erase cycles before wearing out. For 
example, the erasure limit is only 10K for multi-level cell
(MLC) SSDs Q, and even drops to several hundred for the 
latest triple-level cell (TLC) SSDs [13]. Also, bit errors are
common in SSDs due to read disturbs, program disturbs, and 
retention errors llT2l . llTSl . Il27l . Although in practice SSDs 
use error correction codes (ECCs) to protect data [|8], lE6l . 
the protection is limited since the bit error rate increases as 
SSDs issue more erase operations [12], [27]. We call a post-
ECC bit error an uncorrectable bit error. Furthermore, bit 



errors become more severe when the density of flash cells 
increases and the feature size decreases [13]. Thus, SSD
reliability remains a legitimate concern, especially when an 
SSD issues frequent erase operations due to heavy writes. 

RAID (redundant array of independent disks) lISTI pro- 
vides an option to improve reliability of SSDs. Using parity- 
based RAID (e.g., RAID-4, RAID-5), the original data is 
encoded into parities, and the data and parities are striped 
across multiple SSDs to provide storage redundancy against 
failures. RAID has been widely used in tolerating hard 
disk failures, and conventional wisdom suggests that parities 
should be evenly distributed across multiple drives so as 
to achieve better load balancing, e.g., RAID-5. However, 
traditional RAID introduces a different reliability problem 
to SSDs since parities are updated for every data write and 
this aggravates the erase cycles. To address this problem, 
the authors of [2] propose a RAID scheme called Diff-RAID
which aims to enhance the SSD RAID reliability by keeping 
uneven parity distributions. Other studies (e.g., |[T6) . ll20l - 
1221 . 123, E0|) also explore the use of RAID in SSDs. 

However, there remain open issues on the proper architecture
designs of highly reliable SSD RAID [19]. One specific
open problem is how different parity distributions generally 
influence the reliability of an SSD RAID array subject to 
different error rates and array configurations. In other words, 
should we distribute parities evenly or unevenly across 
multiple SSDs with respect to the SSD RAID reliability? 
This motivates us to characterize the SSD RAID reliability 
using analytical modeling, which enables us to readily tune 
different input parameters and determine their impacts on 
reliability. However, analyzing the SSD RAID reliability is 
challenging, as the error rates of SSDs are time-varying. 
Specifically, unlike hard disk drives in which error arrivals 
are commonly modeled as a constant-rate Poisson process 
(e.g., see [28], [33]), SSDs have an increasing error arrival
rate as they wear down with more erase operations. 

In this paper, we formulate a continuous time Markov 
chain (CTMC) model to analyze the effects of different 
parity placement strategies, such as traditional RAID-5 and 
Diff-RAID [2], on the reliability dynamics of an SSD RAID
array. To capture the time-varying bit error rates in SSDs, we 
formulate a non-homogeneous CTMC model, and conduct 
transient analysis to derive the system reliability at any 
specific time instant. To our knowledge, this is the first 
analytical study on the reliability of an SSD RAID array. 

In summary, this paper makes two key contributions:



• We formulate a non-homogeneous CTMC model to 
characterize the reliability dynamics of an SSD RAID 
array. We use the uniformization technique Q, llTSl . 
1321 to derive the transient reliability of the array. 
Since the state space of our model increases with 
the SSD size, we develop optimization techniques to 
reduce the computational cost of transient analysis. We 
also quantify the corresponding error bounds of the 
uniformization and optimization techniques. Using the 
SSD simulator [1], we validate our model via trace-
driven simulations. 

• We conduct extensive numerical analysis to compare 
the reliability of an SSD RAID array under RAID-5
and Diff-RAID [2]. We observe that Diff-RAID, which
places parities unevenly across SSDs, only improves 
the reliability over RAID-5 when the error rate is not 
too large, while RAID-5 is reliable enough if the error 
rate is sufficiently small. On the other hand, when the 
error rate is very large, neither RAID-5 nor Diff-RAID 
can provide high reliability, so increasing fault tolerance 
(e.g., RAID-6 or a stronger ECC) becomes necessary. 

The rest of this paper proceeds as follows. In Section II,
we formulate our model that characterizes the reliability
dynamics of an SSD RAID array, and formally define the
reliability metric. In Section III, we derive the transient
system state using uniformization and some optimization
techniques. In Section IV, we validate our model via trace-
driven simulations. In Section V, we present numerical
analysis results on how different parity placement strategies
influence the RAID reliability. Section VI reviews related
work, and finally Section VII concludes.

II. System Model 

It is well known that RAID-5 is effective in providing 
single-fault tolerance for traditional hard disk storage. It 
distributes parities evenly across all drives and achieves 
load balancing. Recently, Balakrishnan et al. [2] report that
RAID-5 may result in correlated failures, and hence poor 
reliability, for SSD RAID arrays if SSDs are worn out 
at the same time. Thus, they propose a modified RAID 
scheme called Diff-RAID for SSDs. Diff-RAID improves 
RAID-5 through (i) distributing parities unevenly and (ii)
redistributing parities each time a worn-out SSD is
replaced so that the oldest SSD always has the most parities 
and wears out first. However, it remains unclear whether 
Diff-RAID (or placing parities unevenly across drives) really 
improves the reliability of SSD RAID over RAID-5 in all 
error patterns, as there is a lack of comprehensive studies on 
the reliability dynamics of SSD RAID arrays under different 
parity distributions. 

In this section, we first formulate an SSD RAID array,
then characterize the age of each SSD based on the age of the
array (we formally define the concept of age later in this
section). Lastly, we model the error rate based on the
age of each SSD, and formulate a non-homogeneous CTMC
to characterize the reliability dynamics of an SSD RAID
array under various parity distributions, such as those of
RAID-5 and Diff-RAID.
Table I lists the major notations used in this paper.

Specific Notations of SSD

$M$ : Erasure limit of each block (e.g., 10K)
$B$ : Total number of blocks in each SSD
$\lambda_i(t)$ : Error rate of a chunk in SSD $i$ at time $t$

Specific Notations of RAID Array

$N$ : Number of data drives (i.e., an array has $N+1$ SSDs)
$S$ : Total number of stripes in an SSD RAID array
$p_i$ : Fraction of parity chunks in SSD $i$, with $\sum_{i=0}^{N} p_i = 1$
$k$ : Total number of erasures performed on the SSD RAID array (i.e., system age of the array)
$k_i$ : Number of erasures performed on each block of SSD $i$ (i.e., age of SSD $i$)
$T$ : Average inter-arrival time of two consecutive erasure operations on the SSD RAID array
$\pi_j(t)$ : Probability that the array has $j$ stripes that contain exactly one erroneous chunk each ($0 \le j \le S$)
$\pi_{S+1}(t)$ : Probability that at least one stripe of the array contains more than one erroneous chunk, so $\sum_{j=0}^{S+1} \pi_j(t) = 1$
$R(t)$ : Reliability at time $t$, i.e., probability that no data loss happens until time $t$, $R(t) = \sum_{j=0}^{S} \pi_j(t)$

Table I: Notations.



A. SSD RAID Formulations 

An SSD is usually organized in blocks, each of which 
typically contains 64 or 128 pages. Both read and program 
(write) operations are performed in units of pages, and each
page is of size 4KB. Data can only be programmed to clean 
pages. SSDs use an erase operation, which is performed 
in units of blocks, to reset all pages in a block into clean
pages. To improve write performance, SSDs use out-of-place 
writes, i.e., to update a page, the new data is programmed to 
a clean page while the original page is marked as invalid. An 
SSD is usually composed of multiple chips (or packages), 
each containing thousands of blocks. Chips are independent 
of each other and can operate in parallel. We refer readers 
to [1] for a detailed description of the SSD organization.

We now describe the organization of an SSD RAID array 
that we consider, as shown in Figure 1. We consider the
device-level RAID organization where the array is composed
of $N+1$ SSDs numbered from 0 to $N$. In this paper, we
address the case where the array can tolerate a single
SSD failure, as assumed in traditional RAID-4, RAID-5 
schemes and the modified RAID schemes for SSDs 111, llT6ll . 
GOl-lEII, ES), IMI- Each SSD is divided into multiple non- 
overlapping chunks, each of which can be mapped to one 
or multiple physical pages. The array is further divided into 
stripes, each of which is a collection of $N+1$ chunks from
the $N+1$ SSDs. Within a stripe, there are $N$ data chunks,
and one parity chunk encoded from the $N$ data chunks.



Figure 1: Organization of an SSD RAID array.



We call a chunk an erroneous chunk when uncorrectable bit
errors appear in that chunk, or a correct chunk otherwise.
Since we focus on single-fault tolerance, we require that
each stripe contain at most one erroneous chunk to avoid
data loss, so that the erroneous chunk can be recovered from
the other surviving chunks in the same stripe.

Suppose that each SSD contains $B$ blocks, and the array
contains $S$ stripes (i.e., $S$ chunks per SSD). For simplicity,
we assume that all $S$ stripes are used for data storage.
To generalize our analysis, we organize parity chunks in
the array according to some probability distribution. We let
SSD $i$ contain a fraction $p_i$ of parity chunks. In the special
case of RAID-5, parity chunks are evenly placed across all
devices, so $p_i = \frac{1}{N+1}$ for all $i$ if the array consists of $N+1$
drives. For Diff-RAID, the $p_i$'s do not need to be equal to $\frac{1}{N+1}$,
but only need to satisfy the condition $\sum_{i=0}^{N} p_i = 1$.

Each block in an SSD can only sustain a limited number
of erase cycles, and is supposed to be worn out after the
limit. We denote the erasure limit by $M$, which corresponds
to the lifetime of a block. To enhance the durability of SSDs,
efficient wear-leveling techniques are often used to balance
the number of erasures across all blocks. In this paper, we
assume that each SSD achieves perfect wear-leveling such
that every block has exactly the same number of erasures.
Let $k_i$ ($\le M$) be the number of erasures that have been
performed on each block in SSD $i$, where $0 \le i \le N$. We
denote $k_i$ as the age of each block in SSD $i$, or equivalently,
the age of SSD $i$ when perfect wear-leveling is assumed.
When an SSD reaches its erasure limit, we assume that it
is replaced by a new SSD. For simplicity, we treat $k_i$ as
a continuous value in $[0, M]$. Let $k$ be the total number of
erase operations that the whole array has processed, and we
call $k$ the system age of the array.

B. SSD Age Characterization 

In this subsection, we characterize the age
of each SSD for a given RAID scheme. In particular, we
derive $k_i$, the age of SSD $i$, when the whole array
has already performed a total of $k$ erase operations. This
characterization enables us to model the error rate in each
SSD accurately (see Section II-C). We focus on two RAID
schemes: traditional RAID and Diff-RAID [2].

We first quantify the aging rate of each SSD in an array.
Let $r_i$ be the aging rate of SSD $i$. Note that for each stripe,
updating a data chunk also causes the parity chunk to be updated.
Suppose that each data chunk has the same probability of
being accessed. On average, the ratio of the aging rate of
SSD $i$ to that of SSD $j$ can be expressed as [2]:

$$\frac{r_i}{r_j} = \frac{p_i N + (1 - p_i)}{p_j N + (1 - p_j)}. \tag{1}$$

Equation (1) states that the parity chunk ages $N$ times faster
than each data chunk. Given the aging rates $r_i$'s, we can
quantify the probability of SSD $i$ being the target drive for
each erase operation, which we denote by $q_i$. We model $q_i$
by making it proportional to the aging rate of SSD $i$, i.e.,

$$q_i = \frac{r_i}{\sum_{j=0}^{N} r_j} = \frac{p_i N + (1 - p_i)}{\sum_{j=0}^{N} \left(p_j N + (1 - p_j)\right)}. \tag{2}$$
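As a small numerical illustration of Equations (1) and (2), the following sketch computes the erase-target probabilities $q_i$ from a list of parity fractions. The function name and the two example arrays are hypothetical and chosen only for illustration.

```python
def erase_target_probabilities(parity_fractions):
    """Return q_i for every SSD, following Equations (1) and (2).

    parity_fractions[i] is p_i; the list has N + 1 entries that sum to 1.
    """
    n_data = len(parity_fractions) - 1  # N data drives, N + 1 SSDs in total
    if abs(sum(parity_fractions) - 1.0) > 1e-9:
        raise ValueError("parity fractions must sum to 1")
    # Relative aging rate r_i = p_i * N + (1 - p_i), as in Equation (1).
    rates = [p * n_data + (1.0 - p) for p in parity_fractions]
    total = sum(rates)
    # q_i is proportional to the aging rate, as in Equation (2).
    return [r / total for r in rates]


# Hypothetical 4-SSD arrays: an even (RAID-5 style) and an uneven placement.
raid5_q = erase_target_probabilities([0.25, 0.25, 0.25, 0.25])
uneven_q = erase_target_probabilities([0.1, 0.1, 0.1, 0.7])
```

With the even placement every SSD is equally likely to receive an erase, while in the uneven example the parity-heavy SSD receives twice as many erasures as each of the others.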



We now characterize the age of Diff-RAID, which places
parities unevenly and redistributes parity chunks after the
worn-out SSD is replaced so as to maintain the age ratios
and always wear out the oldest SSD first. To mathematically
characterize the system age of Diff-RAID, define $A_i$ as the
remaining fraction of erasures that SSD $i$ can sustain right
after an SSD replacement. Clearly, $A_i = 1$ for a brand-new
drive and $A_i = 0$ for a worn-out drive. Without loss
of generality, we assume that the drives are sorted by $A_i$
in descending order, i.e., $A_0 > A_1 > \cdots > A_N$, and we
have $A_0 = 1$ as it is the newly replaced drive. Diff-RAID
performs parity redistribution to guarantee that the aging
ratio in Equation (1) remains unchanged. Therefore, the
remaining fraction of erasures for each drive will converge,
and the values of the $A_i$'s in the steady state are given by [2]:

$$A_i = \frac{\sum_{j=i}^{N} r_j}{\sum_{j=0}^{N} r_j} = \frac{\sum_{j=i}^{N} \left(p_j N + (1 - p_j)\right)}{\sum_{j=0}^{N} \left(p_j N + (1 - p_j)\right)}, \quad 0 \le i \le N. \tag{3}$$

In this paper, we study Diff-RAID after the age distribution
of SSDs right after each drive replacement converges, i.e.,
the initial remaining fractions of erasures of SSDs in Diff-RAID
follow the distribution of the $A_i$'s in Equation (3).
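The steady-state fractions in Equation (3) are suffix sums of the aging rates divided by their total. A minimal sketch, assuming the drives are already indexed so that SSD $N$ is the oldest (the function name and the example placement are hypothetical):

```python
def steady_state_remaining_fractions(parity_fractions):
    """Return A_0..A_N of Equation (3) for parity fractions p_0..p_N."""
    n_data = len(parity_fractions) - 1
    # Aging rates r_j = p_j * N + (1 - p_j), as in Equation (1).
    rates = [p * n_data + (1.0 - p) for p in parity_fractions]
    total = sum(rates)
    fractions = []
    suffix = 0.0
    # Accumulate r_N, r_N + r_{N-1}, ... to obtain the suffix sums.
    for rate in reversed(rates):
        suffix += rate
        fractions.append(suffix / total)
    fractions.reverse()
    return fractions


# Hypothetical 4-SSD array whose last drive holds most of the parity.
remaining = steady_state_remaining_fractions([0.1, 0.1, 0.1, 0.7])
```

For this example the fractions come out close to 1.0, 0.8, 0.6 and 0.4, so the newly inserted drive starts with its full budget and the last entry equals $q_N$, consistent with the relation $A_N = q_N$ used later in the text.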

We now characterize $k_i$ for Diff-RAID. Recall that each
SSD has $B$ blocks. Due to perfect wear-leveling, every block
of SSD $i$ has the same probability $q_i/B$ of being the target
block for an erase operation. Thus, if the array has processed
$k$ erase operations, the age of SSD $i$ is:

$$\text{Diff-RAID: } k_i = \left(\frac{k q_i}{B} \bmod \frac{q_i}{q_N}(M - k_{N0})\right) + k_{i0}, \tag{4}$$

where $k_{i0} = M(1 - A_i)$ is the initial number of times
that each block of SSD $i$ has been erased right after a
drive replacement, and the notation mod denotes the modulo
operation. The rationale of Equation (4) is as follows. Since
we sort the SSDs by $A_i$ in descending order, SSD $N$ always
has the highest aging rate and will be replaced first. Thus,
after each block of SSD $N$ has performed $(M - k_{N0})$
erasures, SSD $N$ will be replaced, and each block of SSD $i$
has just been erased $\frac{q_i}{q_N}(M - k_{N0})$ times. Therefore, for
SSD $i$, a drive replacement happens every time each block has
been erased $\frac{q_i}{q_N}(M - k_{N0})$ times. Moreover, the initial
number of erasures on each block of SSD $i$ right after a
drive replacement is $k_{i0}$. Thus, the age of SSD $i$ is derived
as in Equation (4). Since $k_{i0} = M(1 - A_i)$ and $A_N = q_N$,
Equation (4) can be rewritten as:

$$\text{Diff-RAID: } k_i = \left(\frac{k q_i}{B} \bmod M q_i\right) + M(1 - A_i). \tag{5}$$

For traditional RAID (e.g., RAID-4 or RAID-5), parity
chunks are kept intact, and will not be redistributed after a
drive replacement. So after the array has performed $k$ erase
operations, each block of SSD $i$ has just performed $kq_i/B$
erasures, and an SSD will be replaced every time each
block has performed $M$ erasures. Thus, the age of SSD $i$ is:

$$\text{Traditional RAID: } k_i = \left(\frac{k q_i}{B}\right) \bmod M. \tag{6}$$
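Equations (5) and (6) map the system age $k$ to the age of an individual SSD. The sketch below evaluates both; the function names and the small parameter values ($B = 100$, $M = 1000$) are hypothetical and were picked only to keep the arithmetic easy to follow.

```python
def diff_raid_age(k, q_i, a_i, blocks, erase_limit):
    """Age k_i of SSD i under Diff-RAID, Equation (5).

    q_i is the erase-target probability, a_i the steady-state remaining
    fraction A_i, blocks is B and erase_limit is M.
    """
    return (k * q_i / blocks) % (erase_limit * q_i) + erase_limit * (1.0 - a_i)


def traditional_raid_age(k, q_i, blocks, erase_limit):
    """Age k_i of SSD i under traditional RAID, Equation (6)."""
    return (k * q_i / blocks) % erase_limit


# Oldest Diff-RAID drive of a hypothetical array with q_N = A_N = 0.4:
# it starts at age M * (1 - A_N) = 600 right after a replacement.
age_start = diff_raid_age(0, 0.4, 0.4, blocks=100, erase_limit=1000)
age_later = diff_raid_age(50000, 0.4, 0.4, blocks=100, erase_limit=1000)

# RAID-5 drive with q_i = 0.25: the age wraps around after M erasures.
age_raid5 = traditional_raid_age(500000, 0.25, blocks=100, erase_limit=1000)
```

In this example the Diff-RAID drive ages from 600 to about 800 erasures per block after 50,000 array-wide erasures, while the RAID-5 drive has wrapped once and sits at 250.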

C. Continuous Time Markov Chain (CTMC) 

We first model the error rate of an SSD. We assume that
the error arrival processes of different chunks in an SSD
are independent. Since different chunks in an SSD have the
same age, they must have the same error rate. We let $\lambda_i(t)$
represent the error rate of each chunk in SSD $i$ at time $t$, and
model it as a function of the number of erasures on SSD $i$
at time $t$, which is denoted by $k_i(t)$ (the notation $t$ may be
dropped if the context is clear). Furthermore, to reflect that
bit errors increase with the number of erasures, we model
the error rate based on a Weibull distribution IImI, which
has been widely used in reliability engineering. Formally,

$$\lambda_i(t) = c\,\alpha\,(k_i(t))^{\alpha - 1}, \quad \alpha > 1, \tag{7}$$

where $\alpha$ is called the shape parameter and $c$ is a constant.
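Equation (7) can be evaluated directly once the age of an SSD is known. A minimal sketch follows; the function name and the constants $c = 10^{-6}$ and $\alpha = 2$ are illustrative assumptions, not values taken from the paper.

```python
def chunk_error_rate(age, c, alpha):
    """Weibull-style error rate of Equation (7): c * alpha * age^(alpha - 1)."""
    if alpha <= 1:
        raise ValueError("the model requires a shape parameter alpha > 1")
    return c * alpha * age ** (alpha - 1)


# Hypothetical constants: with alpha = 2 the rate grows linearly with age.
rate_young = chunk_error_rate(100.0, c=1e-6, alpha=2.0)
rate_old = chunk_error_rate(5000.0, c=1e-6, alpha=2.0)
```

Because $\alpha > 1$, the rate is strictly increasing in the age, which is how the model captures SSD wear.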

Note that even though the error rates of SSDs are time-varying,
they only vary with the number of erasures on the SSDs. If
we let $t_k$ be the time point of the $k$-th erasure on the array,
then during the period $(t_k, t_{k+1})$ (i.e., between the $k$-th and
$(k+1)$-th erasures), the number of erasures on each SSD
is fixed, hence the error rates during this period are
constant, and the error arrivals can be modeled as a Poisson
process. In particular, $k_i(t) = k_i(k)$ if $t \in (t_k, t_{k+1})$, and
the function $k_i(k)$ is expressed by Equations (5) and (6).

We now formulate a CTMC model to characterize the
reliability dynamics of an SSD RAID array. Recall that the
array provides single-fault tolerance for each stripe. We say
that the CTMC is at state $i$ if and only if the array has $i$
stripes that contain exactly one erroneous chunk each, where
$0 \le i \le S$. Data loss happens if any one stripe contains more
than one erroneous chunk, and we denote this state by $S+1$.
Let $X(t)$ be the system state at time $t$. Formally, we have
$X(t) \in \{0, 1, \ldots, S+1\}$, $\forall t \ge 0$. To derive the system state,
we let $\pi_j(t)$ be the probability that the CTMC is at state $j$ at
time $t$ ($0 \le j \le S+1$), so the system state can be characterized
by the vector $\pi(t) = (\pi_0(t), \pi_1(t), \ldots, \pi_{S+1}(t))$.

Figure 2: State transition of the non-homogeneous CTMC.

Let us consider the transition of the CTMC. For each
stripe, if it contains one erroneous chunk, then the erroneous
chunk can be reconstructed from the other surviving chunks
in the same stripe. Assume that only one stripe can be
reconstructed at a time, and that the reconstruction time
follows an exponential distribution with rate $\mu$, so a
reconstruction moves the system from state $j$ to state $j-1$
with rate $\mu$. The state
transition diagram of the CTMC is depicted in Figure 2. To
elaborate, suppose that the RAID array is currently at state
$j$. If an erroneous chunk appears in one of the $(S-j)$ stripes
that originally have no erroneous chunk, then the system moves to
state $j+1$ with rate $(S-j)\sum_{i=0}^{N} \lambda_i(t)$; if an erroneous chunk
appears in one of the $j$ stripes that already have another
erroneous chunk, then the system moves to state $S+1$
(in which data loss occurs) with rate $j\sum_{i=0}^{N} \lambda_i(t)$.
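The transition rules above translate directly into a generator matrix for one period in which the error rates are constant. The sketch below is one possible way to assemble it; the function name is hypothetical, `total_error_rate` stands for $\sum_{i=0}^{N} \lambda_i(t)$, and the tiny example ($S = 3$) is only meant to make the entries easy to inspect.

```python
def build_generator(num_stripes, total_error_rate, repair_rate):
    """Return the (S + 2) x (S + 2) generator matrix as a list of rows.

    States 0..S count stripes with exactly one erroneous chunk and
    state S + 1 is the absorbing data-loss state.
    """
    size = num_stripes + 2
    loss = num_stripes + 1
    q = [[0.0] * size for _ in range(size)]
    for j in range(num_stripes + 1):
        if j < num_stripes:
            # A new error hits one of the S - j clean stripes.
            q[j][j + 1] = (num_stripes - j) * total_error_rate
        if j > 0:
            # One erroneous chunk is reconstructed.
            q[j][j - 1] = repair_rate
        # A second error hits one of the j stripes that already have one.
        q[j][loss] += j * total_error_rate
        # Diagonal entry makes the row sum to zero (q[j][j] is still 0 here).
        q[j][j] = -sum(q[j])
    return q


# Hypothetical example: S = 3 stripes, total error rate 0.5, repair rate 2.
example_q = build_generator(3, 0.5, 2.0)
```

Row $S+1$ stays all zero because data loss is absorbing, and every other diagonal entry equals $-(\mu + S\sum_i \lambda_i)$ except for state 0, which has no repair transition.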

We now define the reliability of an SSD RAID array at
time $t$, and denote it by $R(t)$. Formally, it is the probability
that no stripe has encountered data loss until time $t$:

$$R(t) = \sum_{j=0}^{S} \pi_j(t). \tag{8}$$



Note that our model captures the time-varying nature of 
reliability over the lifespan of the SSD RAID array. Next, 
we show how to analyze this non-homogeneous CTMC. 

III. Transient Analysis of CTMC 

In this section, we derive $\pi(t)$, the system state of an
SSD RAID array at any time $t$. Once we have $\pi(t)$, we can
then compute the instantaneous reliability $R(t)$ according to
Equation (8). There are two major challenges in deriving
$\pi(t)$. First, it involves transient analysis, which is different
from the conventional steady-state Markov chain analysis.
Second, the underlying CTMC $\{X(t), t \ge 0\}$ is non-homogeneous,
as the error arrival rate $\lambda_i(t)$ is time-varying,
and it also has a very large state space.

In the following, we first present the mathematical foundation
for analyzing the non-homogeneous CTMC so as
to compute the transient system state, then formalize an
algorithm based on the mathematical analysis. Finally, we
develop an optimization technique to address the challenge
of the large state space of the CTMC so as to further reduce the
computational cost of the algorithm.

A. Mathematical Analysis on the Non-homogeneous CTMC 

Note that the error rates of SSDs within a period $(t_k, t_{k+1})$
($k = 0, 1, 2, \ldots$) are constant, so if we only focus on a particular
time period of the CTMC, i.e., $\{X(t), t_k < t \le t_{k+1}\}$,
then it becomes a time-homogeneous CTMC. Therefore, the
intuitive way to derive the transient solution of the CTMC
$\{X(t), t \ge 0\}$ is to divide it into many time-homogeneous
CTMCs $\{X(t), t_k < t \le t_{k+1}\}$ ($k = 0, 1, 2, \ldots$), then use
the uniformization technique Q, ifTSl . Il32l to analyze these
time-homogeneous CTMCs one by one in time-ascending
order. Specifically, to derive $\pi(t_{k+1})$, one first derives $\pi(t_1)$
from the initial state $\pi(0)$, then takes $\pi(t_1)$ as the initial
state and derives $\pi(t_2)$ from $\pi(t_1)$, and so on.

However, this computational approach may take a prohibitively
long time to derive $\pi(t_{k+1})$ when $k$ is very large,
which usually occurs in SSDs. Since $k$ denotes the number
of erasures performed on an SSD RAID array, it can grow
up to $(N+1)BM$, where both $B$ (the number of blocks in
an SSD) and $M$ (the erasure limit) can be very large, say,
100K and 10K, respectively (see Sec. V). Therefore, simply
applying the uniformization technique is computationally
infeasible for deriving the reliability of an SSD RAID array,
especially when the array performs a lot of erasures.

To overcome the above challenge, we propose an optimization
technique that combines multiple time periods
together. The main idea is that since the difference between the
generator matrices at two consecutive periods is very small
in general, we consider $s$ consecutive periods together, where
$s$ is called the step size. For simplicity of discussion, let $T$
be the average inter-arrival time of two consecutive erasure
operations, i.e., $t_k = kT$. To analyze the non-homogeneous
CTMC over $s$ periods $\{X(t), lsT < t \le (l+1)sT\}$
($l = 0, 1, \ldots$), we define another time-homogeneous CTMC
$\{\tilde{X}(t), lsT < t \le (l+1)sT\}$ to approximate it and also
quantify the error bound. The derivation of $\tilde{\pi}((l+1)sT)$
given $\tilde{\pi}(lsT)$ proceeds as follows.

Step 1: Constructing a time-homogeneous CTMC
$\{\tilde{X}(t), lsT < t \le (l+1)sT\}$ with generator matrix $\tilde{Q}_l$.
Note that there are $s$ periods in the interval $[lsT, (l+1)sT)$.
We denote the generator matrices of the original Markov
chain $\{X(t)\}$ during each of the $s$ periods by $Q_{ls}$, $Q_{ls+1}$,
$\ldots$, $Q_{(l+1)s-1}$. To construct $\{\tilde{X}(t), lsT < t \le (l+1)sT\}$,
we define $\tilde{Q}_l$ as a function of the $s$ generator matrices:

$$\tilde{Q}_l = f(Q_{ls}, Q_{ls+1}, \ldots, Q_{(l+1)s-1}), \quad l = 0, 1, \ldots \tag{9}$$

Intuitively, $\tilde{Q}_l$ can be viewed as the "average" over the
$s$ generator matrices. To illustrate, consider a special case
where $\alpha$ in Equation (7) is set to $\alpha = 2$. Then the error
arrival rate of each chunk of SSD $i$ becomes $2ck_i$. In this
case, each element of the generator matrix $Q_k$ becomes

$$q_{i,j}(k) = \begin{cases}
-S\Sigma, & i = j = 0, \\
-\mu - S\Sigma, & 0 < i \le S,\ j = i, \\
(S - i)\Sigma, & 0 \le i < S,\ j = i + 1, \\
i\Sigma, & 0 < i \le S,\ j = S + 1, \\
\mu, & 0 < i \le S,\ j = i - 1, \\
0, & \text{otherwise},
\end{cases} \tag{10}$$

where $\Sigma = \sum_{i=0}^{N} 2ck_i$ and $k_i$ is computed by Equations (5)
and (6). Now, for the Markov chain $\tilde{X}(t)$, we let $\tilde{Q}_l$ be the
average of these $s$ generator matrices $Q_k$. Mathematically,

$$\tilde{Q}_l = \frac{1}{s} \sum_{k=ls}^{(l+1)s-1} Q_k, \quad l = 0, 1, \ldots \tag{11}$$



Note that our analysis is applicable for other values of
$\alpha$, with different choices of defining $\tilde{Q}_l$ in Equation (9)
and different error bounds. We leave the further analysis
of different values of $\alpha$ as future work. In the following
discussion, we fix $\alpha = 2$, whose error bound can be derived.

Step 2: Deriving the system state $\tilde{\pi}((l+1)sT)$ under the
time-homogeneous CTMC $\{\tilde{X}(t)\}$. To derive the system
state at time $(l+1)sT$, which we denote as $\tilde{\pi}((l+1)sT)$, we
solve Kolmogorov's forward equation and obtain

$$\tilde{\pi}((l+1)sT) = \tilde{\pi}(lsT) \sum_{n=0}^{\infty} \frac{(\tilde{Q}_l sT)^n}{n!}, \quad l = 0, 1, \ldots \tag{12}$$

where the initial state is $\tilde{\pi}(0) = \pi(0)$.

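Step 3: Applying uniformization to solve Equation (12).
We let $\Lambda_l \ge \max_{ls \le k \le (l+1)s-1} \max_{0 \le i \le S+1} |-q_{i,i}(k)|$, and
let $\tilde{P}_l = I + \frac{\tilde{Q}_l}{\Lambda_l}$. Based on the uniformization technique Q,
the system state at time $(l+1)sT$ can be derived as follows.

$$\tilde{\pi}((l+1)sT) = \sum_{n=0}^{\infty} e^{-\Lambda_l sT} \frac{(\Lambda_l sT)^n}{n!} v_l(n), \quad l = 0, 1, \ldots \tag{13}$$

where $v_l(n) = v_l(n-1)\tilde{P}_l$ and $v_l(0) = \tilde{\pi}(lsT)$. The initial
state is $\tilde{\pi}(0) = \pi(0)$.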

Step 4: Truncating the infinite summation in Equation (13)
with a quantifiable error bound. We denote
the truncation point for interval $(lsT, (l+1)sT)$ by $U_l$ and
denote the system state at time $(l+1)sT$ after truncation by
$\hat{\pi}((l+1)sT)$. We also denote the error caused by combining $s$
periods together and truncating the infinite series in interval
$[lsT, (l+1)sT)$ by $\hat{\epsilon}_{l+1} = \|\hat{\pi}((l+1)sT) - \pi((l+1)sT)\|_1$, where
$\pi((l+1)sT)$ denotes the accurate system state obtained
by iteratively analyzing the time-homogeneous CTMCs
$\{X(t), kT < t \le (k+1)T\}$ ($k = 0, 1, \ldots, (l+1)s - 1$)
from the initial state $\pi(0)$. Now, $\hat{\pi}((l+1)sT)$ and $\hat{\epsilon}_{l+1}$ can be
computed using the following theorem.

Theorem 1: After truncating the infinite series, the system
state at time $(l+1)sT$ for the Markov chain $\{\tilde{X}(t)\}$ with
step size $s$ can be computed as follows.

$$\hat{\pi}((l+1)sT) = \sum_{n=0}^{U_l} e^{-\Lambda_l sT} \frac{(\Lambda_l sT)^n}{n!} v_l(n), \quad l = 0, 1, \ldots \tag{14}$$

where $v_l(n) = v_l(n-1)\tilde{P}_l$ and $v_l(0) = \hat{\pi}(lsT)$. The initial
state is $\hat{\pi}(0) = \pi(0)$. The error is bounded as follows.

$$\hat{\epsilon}_{l+1} \le \hat{\epsilon}_l + \left(1 - \sum_{n=0}^{U_l} e^{-\Lambda_l sT} \frac{(\Lambda_l sT)^n}{n!}\right), \quad l = 0, 1, \ldots \tag{15}$$

where $\hat{\epsilon}_0 = \|\hat{\pi}(0) - \pi(0)\|_1 = 0$.

Proof: Please refer to Appendix. ■
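To make the truncated series of Equation (14) concrete, the following sketch performs one uniformization step for a single time-homogeneous generator and stops once the neglected Poisson mass falls below a tolerance. It is a simplified illustration under stated assumptions, not the authors' implementation: the function name, the `max_terms` guard and the two-state test chain are hypothetical, and the code assumes that the product of the uniformization rate and the duration is small enough for `math.exp` not to underflow.

```python
import math


def uniformization_step(pi0, generator, duration, max_error, max_terms=100000):
    """Advance the state vector pi0 by `duration` under a constant generator.

    Returns the truncated state vector and the neglected Poisson mass.
    """
    size = len(pi0)
    rate = max(-generator[i][i] for i in range(size))  # uniformization rate
    if rate <= 0.0:
        return list(pi0), 0.0  # no transitions are possible
    # Transition matrix of the uniformized chain: P = I + Q / rate.
    p = [[generator[i][j] / rate + (1.0 if i == j else 0.0)
          for j in range(size)] for i in range(size)]
    weight = math.exp(-rate * duration)  # Poisson probability of n = 0 jumps
    if weight == 0.0:
        raise ValueError("rate * duration is too large for this simple sketch")
    state = [0.0] * size
    v = list(pi0)
    mass = 0.0
    for n in range(max_terms):
        mass += weight
        state = [acc + weight * x for acc, x in zip(state, v)]
        if 1.0 - mass <= max_error:
            return state, 1.0 - mass
        weight *= rate * duration / (n + 1)  # next Poisson weight
        v = [sum(v[i] * p[i][j] for i in range(size)) for j in range(size)]
    raise RuntimeError("truncation point not reached within max_terms")


# Two-state check: leaving state 0 at rate 1 gives pi_0(t) = exp(-t).
step_state, leftover = uniformization_step(
    [1.0, 0.0], [[-1.0, 1.0], [0.0, 0.0]], duration=1.0, max_error=1e-9)
```

The returned `leftover` is the quantity inside the parentheses of Equation (15) for one interval, so summing it over intervals gives the accumulated truncation bound.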

B. Algorithm for Computing System State 

In the last subsection, we presented the mathematical foundation
for computing the system state of SSD RAID arrays
and the corresponding error bounds. We now present the
algorithm to compute $\hat{\pi}(t)$ according to Theorem 1. In
particular, we aim to compute the system state at the time
when the $k$-th erasure operation has just occurred, i.e.,
$\hat{\pi}(kT)$. Without loss of generality, we assume that $k$ is an
integer multiple of the step size $s$. Moreover, we denote the
maximum acceptable error by $\epsilon$.

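Algorithm 1 Algorithm for Computing System State $\hat{\pi}(kT)$

Input: Step size $s$, maximum error $\epsilon$, and initial state $\hat{\pi}(0) = \pi(0)$
Output: System state at time $kT$: $\hat{\pi}(kT)$
1: for $l = 0 \to \frac{k}{s} - 1$ do
2:   Let $\tilde{Q}_l = \frac{1}{s}\sum_{m=ls}^{(l+1)s-1} Q_m$;
3:   Choose $\Lambda_l \ge \max_{ls \le m < (l+1)s} \max_{0 \le i \le S+1} |-q_{i,i}(m)|$;
4:   Let $\tilde{P}_l = I + \frac{\tilde{Q}_l}{\Lambda_l}$;
5:   Initialize: $\epsilon_l \leftarrow 0$; $n \leftarrow 0$; $\hat{\pi}((l+1)sT) \leftarrow 0$; $v_l(0) \leftarrow \hat{\pi}(lsT)$;
6:   while $1 - \epsilon_l > \frac{\epsilon s}{k}$ do
7:     $\epsilon_l \leftarrow \epsilon_l + e^{-\Lambda_l sT}\frac{(\Lambda_l sT)^n}{n!}$;
8:     $\hat{\pi}((l+1)sT) \leftarrow \hat{\pi}((l+1)sT) + e^{-\Lambda_l sT}\frac{(\Lambda_l sT)^n}{n!} v_l(n)$;
9:     $n \leftarrow n + 1$;
10:    $v_l(n) \leftarrow v_l(n-1)\tilde{P}_l$;
11:  end while
12: end for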


Algorithm 1 describes the pseudo-code of the algorithm.
Lines 2 to 11 derive the system state in one interval
with $s$ time periods based on the flow in Section III-A.
In particular, Line 2 constructs the generator matrix of our
defined CTMC $\{\tilde{X}(t)\}$. Lines 3 to 5 initialize the necessary
parameters. Lines 6 to 11 implement Equation (14), while
the truncation point is determined based on Equation (15)
and the given maximum error. Note that the condition in
Line 6 indicates that the maximum allowable error in one
interval is $\frac{\epsilon s}{k}$, as there are $\frac{k}{s}$ intervals and the aggregate
maximum allowable error is $\epsilon$. After computing the system
state at time $kT$ using Algorithm 1, we can easily compute
the RAID reliability based on the definition in Equation (8).
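The overall flow of Algorithm 1 can be sketched in a few dozen lines. The version below is an illustrative approximation under several assumptions: all names are hypothetical, `sigma_of_period(m)` is a caller-supplied function returning the total error rate $\sum_i \lambda_i$ during period $m$, the repair rate is positive, and $\Lambda_l sT$ is small enough for `math.exp` not to underflow. Because every entry of the generator in Equation (10) is affine in the total error rate, averaging the $s$ generators is done here by building one generator from the mean rate.

```python
import math


def build_generator(num_stripes, sigma, mu):
    """Generator for one period; sigma is the total error rate."""
    size, loss = num_stripes + 2, num_stripes + 1
    q = [[0.0] * size for _ in range(size)]
    for j in range(num_stripes + 1):
        if j < num_stripes:
            q[j][j + 1] = (num_stripes - j) * sigma
        if j > 0:
            q[j][j - 1] = mu
        q[j][loss] += j * sigma
        q[j][j] = -sum(q[j])  # diagonal is still zero at this point
    return q


def transient_state(pi0, sigma_of_period, num_stripes, mu, k, s, period, eps):
    """Approximate the state at time k * period, processing s periods at once."""
    intervals = k // s  # k is assumed to be a multiple of s
    size = num_stripes + 2
    pi = list(pi0)
    for l in range(intervals):
        sigmas = [sigma_of_period(m) for m in range(l * s, (l + 1) * s)]
        q_avg = build_generator(num_stripes, sum(sigmas) / s, mu)  # Line 2
        rate = mu + num_stripes * max(sigmas)                      # Line 3
        p = [[q_avg[i][j] / rate + (i == j) for j in range(size)]
             for i in range(size)]                                 # Line 4
        span = s * period
        weight = math.exp(-rate * span)
        if weight == 0.0:
            raise ValueError("rate * span is too large for this sketch")
        nxt, v, mass, n = [0.0] * size, pi, 0.0, 0                 # Line 5
        while 1.0 - mass > eps / intervals:                        # Line 6
            mass += weight
            nxt = [a + weight * x for a, x in zip(nxt, v)]
            n += 1
            if n > 100000:
                raise RuntimeError("tolerance too tight for this sketch")
            weight *= rate * span / n
            v = [sum(v[i] * p[i][j] for i in range(size)) for j in range(size)]
        pi = nxt
    return pi


# Hypothetical run: 2 stripes, error rate growing linearly with the period.
start = [1.0, 0.0, 0.0, 0.0]
pi_k = transient_state(start, lambda m: 0.001 * (m + 1), 2, 1.0,
                       k=8, s=4, period=1.0, eps=1e-8)
reliability = sum(pi_k[:-1])  # Equation (8): mass outside the data-loss state
```

When the error rate is constant across periods the averaged generator is exact, so runs with different step sizes should agree up to the truncation tolerance; with a time-varying rate, a larger `s` trades accuracy for fewer uniformization passes.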

Our implementation of Algorithm 1 uses the following
inputs. We fix $s = BM/20$, meaning that for each SSD, we
consider at least 20 time points before it reaches its lifetime 
of BM erasures. The error bound is fixed at e = 10~^. We 
also set $\pi_0(0) = 1$ and $\pi_j(0) = 0$ for $0 < j \le S+1$ to
indicate that the array has no erroneous chunk initially.

Figure 3: State transition after truncation.

Note that the dimension of the matrix $\tilde{P}_l$ is $(S+2) \times (S+2)$
($S$ is the number of stripes), which could be very
large for large SSDs. To further speed up our computation,
we develop another optimization technique by truncating
the states with large state numbers from the CTMC so
as to reduce the dimension of $\tilde{P}_l$. Intuitively, if an array
contains many stripes with exactly one erroneous chunk, it
is more likely that a new erroneous chunk appears in one
of such stripes (and hence data loss occurs) rather than in
a stripe without any erroneous chunk. That is, the transition
rate $q_{i,i+1}$ becomes very small when $i$ is large. We can
thus remove such states with large state numbers without
losing accuracy. We present the details of the optimization
technique in the next subsection.

C. Reducing Computational Cost of Algorithm 1

Note that when state number $i$ increases, the transition
rate $q_{i,i+1}(k)$ decreases while the transition rate $q_{i,S+1}(k)$
increases. This indicates that the higher the current state
number is, the harder it is to transit to states with larger
state numbers, while it is easier to transit to the state of data
loss, i.e., state $S+1$. The physical meaning is that the system
will not contain too many stripes with exactly one erroneous
chunk, as either the erroneous chunk will be recovered, or
another error may appear in the same stripe so that data
loss happens. Therefore, to reduce the computational cost
of deriving the system state, we can truncate the states
with large state numbers so as to reduce the state space of
the Markov chain. Specifically, we truncate the states with
state number bigger than $E$, and let $E+1$ represent the case
when more than $E$ stripes contain exactly one erroneous
chunk. Moreover, we take state $E+1$ as an absorbing state.
Furthermore, we denote the state of data loss by $E+2$. Now,
the state transition can be illustrated in Figure 3.
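The truncated chain described above can be sketched as a smaller generator with $E+3$ states. The code below is one reading of that description, with the function name, argument names and example sizes being hypothetical: states $0..E$ behave as before, state $E+1$ absorbs everything that would have more than $E$ singly-erroneous stripes, and state $E+2$ is data loss.

```python
def build_truncated_generator(num_stripes, cutoff, sigma, mu):
    """Return the (E + 3) x (E + 3) generator after state truncation.

    cutoff is E, sigma is the total error rate and mu the repair rate.
    """
    if not 0 <= cutoff < num_stripes:
        raise ValueError("cutoff E must satisfy 0 <= E < S")
    size = cutoff + 3
    loss = cutoff + 2  # data-loss state E + 2
    q = [[0.0] * size for _ in range(size)]
    for j in range(cutoff + 1):
        # For j = E this transition enters the absorbing overflow state E + 1.
        q[j][j + 1] = (num_stripes - j) * sigma
        if j > 0:
            q[j][j - 1] = mu
        q[j][loss] += j * sigma
        q[j][j] = -sum(q[j])  # diagonal is still zero at this point
    # Rows E + 1 and E + 2 stay zero: both states are absorbing.
    return q


# Hypothetical sizes: 1000 stripes tracked with only E = 3 transient levels.
small_q = build_truncated_generator(1000, 3, sigma=1e-4, mu=1.0)
```

The matrix dimension now depends on $E$ rather than on $S$, which is what makes each uniformization pass cheaper; the probability mass that ends up in state $E+1$ is the price paid for the truncation.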

To compute the system state after states truncation, we 
denote the new CTMC by {X{t),t > 0}, the new generator 
matrix during period {kT, {k+ 1)T) by Q^., and the system 
state at time {k+l)T by 7r((fc+l)'r). We use notations with 
a bar to represent the case when system states of the CTMC 



are truncated if the context is clear Similar to Equation (IT2] |. 
given the initial state -r:{kT), the system state at time (fc + 
1)T for the CTMC {X{t), t > 0} can be derived as follows. 



$$\bar{\pi}((k+1)T) = \bar{\pi}(kT) \sum_{n=0}^{\infty} \frac{(\bar{Q}_k T)^n}{n!}. \qquad (16)$$



If we denote the error caused by truncating the states at time
$kT$ by $\bar{\epsilon}_k$, then $\bar{\epsilon}_k$ can be formally defined as follows.

$$\bar{\epsilon}_k = \max_{0 \leq i \leq E} \left| \bar{\pi}_i(kT) - \pi_i(kT) \right|,$$



where $\bar{\pi}_i(kT)$ represents the probability of the system being in
state $i$ at time $kT$ for the CTMC $\{\bar{X}(t), t \geq 0\}$, i.e., the
Markov chain after state truncation, and $\pi_i(kT)$ represents
the probability of the system being in state $i$ at time $kT$ for
the original CTMC $\{X(t), t \geq 0\}$. Clearly, $\bar{\epsilon}_0 = 0$ as the
two Markov chains have the same initial state, i.e., $\bar{\pi}_i(0) = \pi_i(0)$.
The error caused by state truncation is bounded by

$$\bar{\epsilon}_k \leq \bar{\pi}_{E+1}(kT). \qquad (17)$$

Again, we can follow the steps in Section III-A, i.e.,
use Algorithm 1 to compute the system state for the Markov
chain after state truncation, $\{\bar{X}(t), t \geq 0\}$.
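As an illustration of one uniformization step on a small chain such as the truncated one, the following Python sketch advances a state distribution by one period of length `T`. It is a generic textbook form of uniformization written under assumptions (pure-Python lists, the uniformization rate taken as the largest exit rate, a fixed tolerance on the neglected Poisson mass), not a transcription of Algorithm 1.

```python
import math

def uniformization_step(pi, Q, T, tol=1e-12, max_terms=100_000):
    """Return pi * exp(Q*T) for a generator Q via uniformization.

    Intended for moderate values of Lam*T; exp(-Lam*T) underflows
    for very large products, which this sketch does not handle.
    """
    size = len(Q)
    Lam = max(-Q[i][i] for i in range(size))   # uniformization rate
    if Lam == 0.0:
        return list(pi)                        # no transitions possible
    # P = I + Q / Lam is a stochastic matrix.
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / Lam for j in range(size)]
         for i in range(size)]
    weight = math.exp(-Lam * T)    # Poisson probability of zero jumps
    term = list(pi)                # pi * P^0
    result = [weight * x for x in term]
    covered = weight               # Poisson mass accumulated so far
    n = 0
    while 1.0 - covered > tol and n < max_terms:
        n += 1
        term = [sum(term[i] * P[i][j] for i in range(size))
                for j in range(size)]
        weight *= Lam * T / n      # next Poisson weight
        result = [r + weight * x for r, x in zip(result, term)]
        covered += weight
    return result
```

For a two-state chain with a single transition of rate 2 into an absorbing state and `T = 0.5`, the result should approach `[exp(-1), 1 - exp(-1)]`.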

IV. Model Validation 

In this section, we validate via trace-driven simulation 
the accuracy of our CTMC model in quantifying the RAID
reliability $R(t)$. We use Microsoft's SSD simulator lH]
based on DiskSim fJl. Since each SSD contains multiple 
chips that can be configured to be independent of each other 
and handle I/O requests in parallel, we consider RAID at 
the chip level (as opposed to device level) in our DiskSim 
simulation. Specifically, we configure each chip to have its 
own data bus and control bus and treat it as one drive, and 
also treat the SSD controller as the RAID controller where 
parity-based RAID is built. 

To simulate error arrivals, we generate error events based 
on Poisson arrivals given the current system age $k$ of the
array. As the array ages, we update the error arrival rates
accordingly by varying the variable $k_i(t)$ in Equation (7). We
also generate recovery events whose recovery times follow
an exponential distribution with a fixed rate $\mu = 1$. Both
error and recovery events are fed into the SSD simulator as 
special types of I/O requests. We consider three cases: error 
dominant, comparable, and recovery dominant, in which the 
error rate is larger than, comparable to, and smaller than the 
recovery rate, respectively. 
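The event generation can be sketched in Python as follows. This is only an illustration under assumptions: the error rate is held constant over the sampled window (the simulation updates it as the array ages), each error is paired with its own recovery event, and the function name and event encoding are hypothetical.

```python
import random

def generate_events(error_rate, mu, horizon, seed=0):
    """Sample error arrivals and their recovery completions.

    Errors form a Poisson process with rate error_rate on [0, horizon);
    each recovery takes an exponential time with rate mu.
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    while True:
        # Exponential gaps between arrivals give Poisson arrivals.
        t += rng.expovariate(error_rate)
        if t >= horizon:
            break
        events.append((t, "error"))
        events.append((t + rng.expovariate(mu), "recovery"))
    events.sort()   # merge both kinds of events into time order
    return events
```

Choosing `error_rate` larger than, close to, or smaller than `mu` would correspond to the error dominant, comparable, and recovery dominant cases.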

Our validation measures the reliability of traditional
RAID and Diff-RAID with different parity distributions.
Recall that Diff-RAID redistributes the parities after each
drive replacement, while traditional RAID does not. We
consider $(N+1)$ chips where $N = 3, 5, 7$. For traditional
RAID, we choose RAID-5, in which parity chunks are
evenly placed across the chips; for Diff-RAID, 10% of the parity
chunks are placed in each of the first $N$ chips and the remaining
parity chunks are placed in the last flash chip.
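For concreteness, the two parity layouts used in the validation can be written out as fractions per chip. The sketch below assumes the 10% share applies to each of the first N chips; the function names are hypothetical.

```python
def raid5_layout(N):
    """Even parity placement across N + 1 chips."""
    return [1.0 / (N + 1)] * (N + 1)

def diff_raid_validation_layout(N, share=0.10):
    """10% of parity on each of the first N chips, the rest on the last."""
    layout = [share] * N
    layout.append(1.0 - share * N)
    return layout

for N in (3, 5, 7):
    print(N, raid5_layout(N), diff_raid_validation_layout(N))
```

Under this reading, the last chip holds 70%, 50%, and 30% of the parity chunks for N = 3, 5, and 7, respectively.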

We generate a synthetic uniform workload in which the
write requests access the addresses of the entire address 
space with equal probability. The workload lasts until all 
drives are worn out and replaced at least once. We run 
the DiskSim simulation 1000 times, and in each run we 
record the age when data loss happens. Finally, we derive 
the probability of data loss and the reliability based on 
our definitions. To speed up our DiskSim simulation, we 
consider a small-scale RAID array, in which each chip 
contains 80 blocks with 64 pages each, and the chunk size 
is set to be the same as the page size 4KB. We also set a 
low erasure limit at $M = 100$ cycles for each block.
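One way to turn the recorded data-loss ages into an empirical reliability curve is sketched below. It assumes that reliability at age k is estimated as the fraction of runs with no data loss up to age k, and that a run without any data loss is recorded as `None`; both conventions are assumptions for illustration.

```python
def empirical_reliability(loss_ages, k):
    """Fraction of simulation runs that have not lost data by system age k.

    loss_ages holds one entry per run: the age at which data loss
    happened, or None if the run ended without data loss.
    """
    survived = sum(1 for age in loss_ages if age is None or age > k)
    return survived / len(loss_ages)

# Example with four hypothetical runs.
ages = [10, 20, None, 30]
curve = [(k, empirical_reliability(ages, k)) for k in (5, 15, 25, 35)]
```

The empirical probability of data loss by age k is then one minus this value.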

Figure 4 shows the reliability $R(t)$ versus the system age
k obtained from both the model and DiskSim results. We 
observe that our model accurately quantifies the reliability 
for all cases. Also, Diff-RAID shows its benefit only in 
the comparable case. In the error dominant case, traditional 
RAID always shows higher reliability than Diff-RAID; in 
the recovery dominant case, there is no significant difference 
between traditional RAID and Diff-RAID. We will further 
discuss these findings in Section V.

V. Numerical Analysis 

In this section, we conduct numerical analysis on the 
reliability dynamics of a large-scale SSD RAID array with 
respect to different parity placement strategies. We then
summarize the lessons learned from our analysis.

A. Choices of Default Model Parameters 

We first describe the default model parameters used in our 
analysis, and provide justifications for our choices. 

We consider an SSD RAID array composed of $N + 1$
SSDs, each being modeled by the same set of parameters. By
default, we set $N = 9$. Each block of an SSD has 64 pages of
size 4KB each. We consider 32GB SSDs with $B = 131{,}072$
blocks. We configure the chunk size to be equal to the block size,
i.e., there are $S = B = 131{,}072$ chunks¹. We also have
each block sustain $M = 10$K erase cycles.
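The block count follows directly from the drive geometry, as this short check shows (binary units are assumed, i.e., 1KB = 1024 bytes and 1GB = 1024³ bytes).

```python
KB = 1024
GB = 1024 ** 3

pages_per_block = 64
page_bytes = 4 * KB
block_bytes = pages_per_block * page_bytes   # 256KB per block
B = (32 * GB) // block_bytes                 # blocks in a 32GB SSD
S = B                                        # chunk size equals block size

print(block_bytes, B, S)
```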

We now describe how we configure the error arrival rate,
i.e., $\lambda_i = 2ck_i$, by setting the constant $c$. We employ 4-bit
ECC protection per 512 bytes of data, the industry standard
for today's MLC flash. Based on the uncorrectable bit error
rates (UBERs) calculated in IJ], we choose the UBER in
the range $[10^{-16}, 10^{-18}]$ when an SSD reaches its rated
lifetime (i.e., the erasure limit $M$ is reached). Since we set
the chunk size to be equal to the block size, the probability
that a chunk contains at least one bit error is roughly in
the range $[2 \times 10^{-10}, 2 \times 10^{-12}]$. Based on the analysis
of real enterprise workload traces [29], a RAID array can
have several hundred gigabytes of data being accessed per
day. If the write request rate is set as 1TB per day (i.e., 50
blocks per second), then the error arrival rate per chunk at
its rated lifetime (i.e., $\lambda_i = 2cM$) is approximately in the
range $[10^{-8}, 10^{-10}]$. The corresponding parameter $c$ is in
the range $[0.5 \times 10^{-12}, 0.5 \times 10^{-14}]$.

¹ In practice, SSDs are over-provisioned [1], so the actual number of
blocks (or chunks) that can be used for storage (i.e., $S$) should be smaller.
However, the key observations of our results here still hold.
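The chain of estimates from UBER to the constant c can be retraced numerically. The sketch below assumes UBER endpoints of 10⁻¹⁶ and 10⁻¹⁸ (the values implied by the chunk-level probabilities of about 2 × 10⁻¹⁰ and 2 × 10⁻¹²), approximates the chunk error probability by UBER times the number of bits in a chunk, and multiplies by the write rate in blocks per second; the function name is hypothetical.

```python
def error_constant(uber, M=10_000, blocks_per_second=50):
    """Retrace UBER -> chunk error probability -> arrival rate -> c."""
    chunk_bits = 64 * 4 * 1024 * 8        # 256KB chunk, in bits
    p_chunk = uber * chunk_bits           # approx. P(chunk has a bit error)
    lam = blocks_per_second * p_chunk     # errors per second at age M
    c = lam / (2 * M)                     # from lambda_i = 2 * c * M
    return p_chunk, lam, c

for uber in (1e-16, 1e-18):
    print(uber, error_constant(uber))
```

The results land close to the rounded figures in the text, e.g., c is roughly 0.5 × 10⁻¹² for an UBER of 10⁻¹⁶.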



For the error recovery rate $\mu$, we note that the aggregate
error arrival rate when all $N + 1$ drives are about to wear out
is $2cMS(N+1)$. If $N = 9$, then the aggregate error arrival
rate is roughly in the range $[10^{-2}, 10^{-4}]$. We fix $\mu = 10^{-3}$.

We compare different cases when the error arrivals are
more dominant than error recoveries, and vice versa. We
consider three cases of error patterns: $c = 1.1 \times 10^{-13}$,
$c = 0.4 \times 10^{-13}$, and $c = 0.1 \times 10^{-13}$, which correspond
to the error dominant, comparable, and recovery dominant
cases, respectively. Specifically, when $c = 0.4 \times 10^{-13}$, the
aggregate error arrival rate of the array when all SSDs reach
their rated lifetime is around $2cMS(N+1) \approx 10^{-3}$ (where
$N = 9$, $M = 10$K, and $S = 131{,}072$).
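The aggregate rate 2cMS(N+1) can be evaluated for the three error patterns and compared with the recovery rate. The sketch assumes the three values of c are 1.1, 0.4, and 0.1 times 10⁻¹³ and that the recovery rate is 10⁻³; these exponents are inferred from the requirement that the middle case gives an aggregate rate near the recovery rate.

```python
M, S, N = 10_000, 131_072, 9
mu = 1e-3   # assumed recovery rate

def aggregate_rate(c):
    """Aggregate error arrival rate when all N + 1 drives reach age M."""
    return 2 * c * M * S * (N + 1)

cases = {
    "error dominant": 1.1e-13,
    "comparable": 0.4e-13,
    "recovery dominant": 0.1e-13,
}
for name, c in cases.items():
    rate = aggregate_rate(c)
    print(f"{name}: rate = {rate:.2e}, rate / mu = {rate / mu:.2f}")
```

The ratio to the recovery rate comes out at roughly 2.9, 1.0, and 0.26, which matches the labels of the three cases.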

We now configure $T$, the time interval between two
neighboring erase operations. Suppose that there are 1TB
of writes per day as described above. The inter-arrival time
of write requests is around $3 \times 10^{-4}$ seconds for 4KB page
size. Thus, the average time between two erase operations is
$1.9 \times 10^{-2}$ seconds as an erase is triggered after writing 64
pages. In practice, each erase causes additional writes (i.e.,
write amplification ifTsl ) as it moves data across blocks, so
$T$ should be smaller. Here, we fix T = 10"^ seconds.
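The timing estimates can be reproduced with a few lines of arithmetic. Binary units are assumed here (1TB = 2⁴⁰ bytes, 4KB = 4096 bytes), so the unrounded values differ slightly from the rounded figures quoted in the text.

```python
seconds_per_day = 24 * 60 * 60
pages_per_day = 2**40 // 4096              # 4KB pages in 1TB of writes

inter_arrival = seconds_per_day / pages_per_day   # seconds between page writes
erase_gap = 64 * inter_arrival                    # one erase per 64 pages

print(f"inter-arrival: {inter_arrival:.2e} s, erase gap: {erase_gap:.2e} s")
```

Rounding the inter-arrival time to 3 × 10⁻⁴ seconds before multiplying by 64 gives the figure of about 1.9 × 10⁻² seconds between erase operations.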

We compare the reliability dynamics of RAID-5 and
different variants of Diff-RAID. For RAID-5, each drive
holds a fraction $\frac{1}{N+1}$ of parity chunks; for Diff-RAID, we
choose the parity distribution (i.e., $p_i$'s for $0 \leq i \leq N$) based
on a truncated normal distribution. Specifically, we consider
a normal distribution $\mathcal{N}(N+1, \sigma^2)$ with mean $N + 1$
and standard deviation $\sigma$, and let $f$ be the corresponding
probability density function. We then choose the $p_i$'s as follows:

$$p_i = \frac{\int_{i}^{i+1} f(x)\,dx}{\int_{0}^{N+1} f(x)\,dx}, \quad 0 \leq i \leq N. \qquad (18)$$

We can choose different distributions of $p_i$ by tuning the
parameter $\sigma$. Intuitively, the larger $\sigma$ is, the more evenly the $p_i$'s
are distributed. We consider three cases: $\sigma = 1$, $\sigma = 2$, and
$\sigma = 5$. Suppose that $N = 9$. Then for $\sigma = 1$, SSD $N$ and
SSD $N-1$ hold 68% and 27% of parity chunks, respectively;
for $\sigma = 2$, SSD $N$, SSD $N-1$, and SSD $N-2$ hold 38%,
30%, and 18% of parity chunks, respectively; for $\sigma = 5$, the
proportions of parity chunks range from 2.8% (in SSD 0)
to 16.6% (in SSD $N$). After choosing the $p_i$'s, the age of each
block of SSD $i$ (i.e., $k_i$) can be computed via Equation (5).
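The truncated-normal parity distribution can be evaluated with the standard normal CDF. The sketch below assumes that SSD i receives the probability mass of a normal distribution with mean N + 1 on the interval [i, i+1], normalized by the mass on [0, N+1]; this reading reproduces the percentages quoted for N = 9. The function names are illustrative.

```python
import math

def normal_cdf(x, mean, sigma):
    """CDF of a normal distribution, expressed through math.erf."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))

def parity_distribution(N, sigma):
    """Fractions p_0..p_N of parity chunks held by SSD 0..N."""
    mean = N + 1
    total = normal_cdf(N + 1, mean, sigma) - normal_cdf(0, mean, sigma)
    return [
        (normal_cdf(i + 1, mean, sigma) - normal_cdf(i, mean, sigma)) / total
        for i in range(N + 1)
    ]

for sigma in (1, 2, 5):
    p = parity_distribution(9, sigma)
    print(sigma, [round(100 * x, 1) for x in p])
```

A smaller `sigma` concentrates the parity on the last few SSDs, while a larger one spreads it more evenly.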

B. Impact of Different Error Dynamics 

We now show the numerical results of RAID reliability 
based on the parameters described earlier. We assume that
drive replacement can be completed immediately after the 
oldest SSD reaches its rated lifetime. When the oldest drive 
is replaced, all its chunks (including any erroneous chunks) 
are copied to the new drive. Thus, the reliability (or the 
probability of no data loss) remains the same. We consider 
three error cases: the error dominant, comparable, and recovery
dominant cases, as described above.
Case 1: Error dominant case. Figure 5(a) shows
the numerical results for the error dominant case. Initially,
RAID-5 achieves very good reliability as all drives are
brand-new. However, as SSDs wear down, the bit error rate
increases, and this makes the RAID reliability decrease very
quickly. In particular, the reliability drops to zero (i.e., data
loss always happens) when the array performs around 5x10^
erasures. For Diff-RAID, the more evenly parity chunks
are distributed, the lower the RAID reliability is. In the error
dominant case, since the error arrival rate is much bigger than
the recovery rate, the RAID reliability drops to zero very
quickly no matter what parity placement strategy is used. We
note that Diff-RAID is less reliable than traditional RAID-5
in the error dominant case. The reason is that for Diff-RAID,
the initial ages of SSDs when constructing the RAID
array are not zero, but instead follow the convergent age
distribution (i.e., based on A/s in Equation ^). When the error
arrival rate is very large, the array suffers from low reliability
even if the array only performs a small number of erasures.
However, since RAID-5 is always constructed from
brand-new SSDs, it starts with a very high reliability.
Case 2: Comparable case. Figure 5(b) shows the results for
the comparable case. RAID-5 achieves very good reliability
initially, but its reliability decreases dramatically as the SSDs
wear down. Also, since all drives wear down at the same rate,
the reliability of the array is about zero when all drives reach
their erasure limits, i.e., when the system age is around
$1.3 \times 10^{10}$ erasures. Diff-RAID shows different reliability
dynamics. Initially, Diff-RAID has lower reliability than RAID-5,
but its reliability drops much more slowly than that of
RAID-5 as SSDs wear down. The reason is that, because Diff-RAID
has uneven parity placement, SSDs are worn out at different
times and are replaced one by one. When the worn-out
SSD is replaced, the other SSDs have performed fewer erase
operations and have small error rates. This prevents the whole
array from suffering a very large error rate as in RAID-5. Also,
the reliability is higher when the parity distribution is more
skewed (i.e., smaller $\sigma$), as also observed in [2].
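The system age at which an evenly worn RAID-5 array reaches its erasure limit can be checked with the default parameters. The sketch assumes that the system age counts erasures over the whole array and that every block of every drive is erased exactly M times at wear-out.

```python
N, B, M = 9, 131_072, 10_000

# Total erasures when all (N + 1) drives, each with B blocks,
# have used up M erase cycles per block.
wearout_age = (N + 1) * B * M

print(wearout_age)        # 13107200000, i.e., about 1.3e10
print(2 * wearout_age)    # about 2.6e10, two full wear-out periods
```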
Case 3: Recovery dominant case. Figure 5(c) shows the
results for the recovery dominant case. RAID-5 shows high
reliability in general. Between two replacements (which
happen every $1.3 \times 10^{10}$ erasures), its data loss probability
changes by no more than 3%. Its reliability drops slowly right after
each replacement, and the drop rate increases as the drives
approach wear-out. Diff-RAID shows higher reliability than RAID-5
in general, but the difference is small (e.g., less than 6%
between Diff-RAID for $\sigma = 1$ and RAID-5). Therefore, in
the recovery dominant scenario, we may deploy RAID-5
instead of Diff-RAID, as the latter introduces higher costs
for parity redistribution in each replacement and has lower
I/O throughput due to the load imbalance of parities.

C. Impact of Different Array Configurations 

We further study via our model how different array configurations
affect the RAID reliability. We focus on Diff-RAID
and generate the parity distribution $p_i$'s with $\sigma = 1$. Our goal
is to validate the robustness of our model in characterizing
the reliability for different array configurations.
Impact of N. Figure 6(a) shows the impact of the RAID size
$N$. We fix the other parameters as in the comparable
case, i.e., $\mu = 10^{-3}$, $c = 0.4 \times 10^{-13}$, and $M = 10^4$.
The larger the system size, the lower the RAID reliability.
Intuitively, the probability of having one more erroneous
chunk in a stripe increases with the stripe width (i.e., $N+1$).
Note that the reliability drop is significant when $N$ increases.
For example, at $2.6 \times 10^{10}$ erasures, the reliability drops from
0.7 to 0.2 when $N$ increases from 9 to 19.
Impact of ECC. Figure 6(b) shows the impact of different
ECC lengths. We fix $\mu = 10^{-3}$, $M = 10^4$, and $N = 9$. We
also fix the raw bit error rate (RBER) as $1.3 \times 10^{-6}$ ||2l, and
compute the uncorrectable bit error rate using the formulas
in [27]. Then, as described in Section V-A, we derive $c$
for different ECCs that can correct 3, 4, and 5 bits per 512-byte
sector, and the corresponding values are $4.4 \times 10^{-11}$,
$4.7 \times 10^{-14}$, and $4.2 \times 10^{-17}$, respectively. We observe that
the RAID reliability drops to zero very quickly for 3-bit
ECC at around 10^ erasures, while the RAID reliability
for 5-bit ECC does not start to decrease until the array performs
10^^ erasures. This shows that the RAID reliability heavily
depends on the reliability of each single SSD, or the ECC
length employed in each SSD.
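The dependence of c on the ECC strength can be approximated with a simple binomial-tail estimate of the UBER. This is an assumption made for illustration and not necessarily the formula of the cited reference: bit errors are taken as independent with an RBER of 1.3 × 10⁻⁶, a sector is counted as 4096 data bits (ECC bits are ignored), and the UBER is the probability of more than t errors in a sector divided by the sector size in bits.

```python
import math

def uber_binomial(rber, t, n_bits=4096, extra_terms=20):
    """UBER estimate: P(more than t bit errors in a sector) / sector bits."""
    tail = sum(
        math.comb(n_bits, j) * rber**j * (1.0 - rber) ** (n_bits - j)
        for j in range(t + 1, t + 1 + extra_terms)
    )
    return tail / n_bits

def error_constant_for_ecc(t, rber=1.3e-6, M=10_000, blocks_per_second=50):
    """Constant c for an ECC that corrects t bits per 512-byte sector."""
    chunk_bits = 64 * 4 * 1024 * 8                 # 256KB chunk, in bits
    lam = blocks_per_second * uber_binomial(rber, t) * chunk_bits
    return lam / (2 * M)                           # from lambda_i = 2 * c * M

for t in (3, 4, 5):
    print(t, f"{error_constant_for_ecc(t):.2e}")
```

Under these assumptions the estimates fall within a few percent of the quoted values, on the order of 10⁻¹¹, 10⁻¹⁴, and 10⁻¹⁷ for 3-, 4-, and 5-bit ECC, so each additional correctable bit lowers c by about three orders of magnitude.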

Impact of M. Figure 6(c) shows the impact of the erasure
limit $M$, or the endurance of a single SSD, on the RAID
reliability. We fix the other parameters with $\mu = 10^{-3}$, $N = 9$,
and $c = 0.4 \times 10^{-13}$. We observe that when $M$ decreases,
the RAID reliability increases. For example, at $1.3 \times 10^{10}$
erasures, the RAID reliability increases from 0.85 to 0.99
when $M$ decreases from 10K to 1K. Recall that the error
rates increase with the number of erasures in SSDs. With a
smaller $M$, the increase of bit error rates is capped by the small
erasure limit. The trade-off is that the SSDs are worn out
and replaced more frequently with smaller $M$.

D. Discussion 

Our results provide several insights into constructing 
RAID for SSDs. 

• The error dominant case may correspond to low-end
MLC or TLC SSDs with high bit error rates, especially
when these types of SSDs have low I/O bandwidth
for RAID reconstruction. Both traditional RAID-5 and
Diff-RAID show low reliability. A higher degree of
fault tolerance (e.g., using RAID-6 or stronger ECC)
becomes necessary in this case.

• When the error arrival and recovery rates are similar,
Diff-RAID, with uneven parity distribution, achieves
higher reliability than RAID-5, especially since RAID-5
reaches zero reliability when all SSDs are worn out
simultaneously. This conforms to the findings in [2].

• In the recovery dominant case, which may correspond
to high-end single-level cell (SLC) SSDs that typically
have very small bit error rates, RAID-5 achieves
very high reliability. We may choose RAID-5 over Diff-RAID
in RAID deployment to save the overhead of
parity redistribution in Diff-RAID.

• Our model can effectively analyze the RAID reliability
with regard to different RAID configurations.

VI. Related Work 

There have been extensive studies on NAND flash-based
SSDs. A detailed survey of the algorithms and data structures
for flash memories is found in [11]. Recent papers empirically
study the intrinsic characteristics of SSDs (e.g., HI,
[5]), or develop analytical models for the write performance
(e.g., Q, ifTSl ) and garbage collection algorithms (e.g., [23])
of SSDs.

Bit error rates of SSDs are known to increase with the
number of erase cycles [12], [27]. To improve reliability,
prior studies propose to adopt RAID for SSDs at the device
level El, mi, lED, iall, Ea, EUI, or at the chip level [20].
These studies focus on developing new RAID schemes that
improve the performance and endurance of SSDs over traditional
RAID. The performance and reliability implications
of RAID on SSDs are also experimentally studied in [19]. In
contrast, our work focuses on quantifying the reliability dynamics
of SSD RAID from a theoretical perspective. The authors
of Diff-RAID [2] also attempt to quantify the reliability,
but they only compute the reliability at the instants of SSD
replacements, while our model captures the time-varying
nature of error rates in SSDs and quantifies the instantaneous
reliability during the whole lifespan of an SSD RAID array.

RAID was first introduced in tSlI and has been widely
used in many storage systems. Performance and reliability
analysis of RAID in the context of hard disk drives has
been extensively studied (e.g., see H, JU, ED, (HI, [35]).
On the other hand, SSDs have a distinct property that their
error rates increase as they wear down, so a new model is
necessary to characterize the reliability of SSD RAID.

VII. Conclusions 

We develop the first analytical model that quantifies the re- 
liability dynamics of SSD RAID arrays. We build our model 
as a non-homogeneous continuous time Markov chain, and 
use uniformization to analyze the transient state of the RAID 
reliability. We validate the correctness of our model via 
trace-driven DiskSim simulation with SSD extensions. 

One major application of our model is to characterize the 
reliability dynamics of general RAID schemes with different 
parity placement distributions. To demonstrate, we compare 
the reliability dynamics of the traditional RAID-5 scheme
and the new Diff-RAID scheme under different error patterns 
and different array configurations. Our model provides a 
useful tool for system designers to understand the reliability 
of an SSD RAID array with regard to different scenarios.
Appendix

A. Proof of Theorem 1 in Section III-A

The computation of the system state in Equation (14) is
intuitive since the truncation point is $U_l$ in interval $[lsT, (l+1)sT)$.
In the following, we focus on the derivation of the
error bound. Note that $\pi((l+1)sT)$ is the system state at
time $(l+1)sT$ for the CTMC $\{X(t)\}$. Moreover, given the
state at time $lsT$, $\pi((l+1)sT)$ is computed iteratively by
computing $\pi((ls+1)T)$, $\pi((ls+2)T)$, ..., $\pi((ls+s)T)$
sequentially. During each step, e.g., deriving $\pi((k+1)T)$
from $\pi(kT)$ ($ls \leq k < (l+1)s$), uniformization is used.
Without loss of generality, we can let $\Lambda_k = \Lambda_l$ ($ls \leq k < (l+1)s$)
as $\Lambda_l \geq \max_{0 \leq i \leq S+1} |-q_{i,i}(k)|$ for all $k$ ($ls \leq k < (l+1)s$).
Since $Q_k$ denotes the generator matrix of the
homogeneous CTMC $\{X(t), kT < t < (k+1)T\}$, to apply
uniformization, we let $P_k = I + \frac{Q_k}{\Lambda_l}$ ($ls \leq k < (l+1)s$).
Since every element of $P_k$ is a linear function of $k$, the
difference between two matrices $P_{k+1} - P_k$ must be the
same for all $k$, and we denote it by $D$. Formally, we have

$$D = P_{k+1} - P_k, \quad ls \leq k < (l+1)s. \qquad (19)$$

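Now, we can easily find that $P_k = P_{ls} + (k - ls)D$
($ls \leq k \leq (l+1)s - 1$). Moreover, since $P_l = I + \frac{Q_l}{\Lambda_l}$ and $Q_l$ is
defined as $\frac{1}{s}\sum_{k=ls}^{(l+1)s-1} Q_k$ in Equation ( llll ), we have

$$P_l = \frac{\sum_{k=ls}^{(l+1)s-1} P_k}{s} = \frac{\sum_{k=ls}^{(l+1)s-1} \left(P_{ls} + (k - ls)D\right)}{s} = P_{ls} + \frac{s-1}{2} D. \qquad (20)$$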



Note that based on the analysis of $\{X(t), kT < t < (k+1)T\}$
by using uniformization, $\pi((k+1)T)$ ($ls \leq k < (l+1)s$)
can be rewritten as follows.

$$\pi((k+1)T) = \pi(kT)\, e^{-\Lambda_l T} e^{\Lambda_l T P_k} = \pi(kT)\, e^{-\Lambda_l T} e^{\Lambda_l T \left(P_{ls} + (k - ls)D\right)}.$$



Observe that most elements in the difference matrix $D$
are zero, and the non-zero elements are all very small. By
examining the elements in $DP_{ls}$ and the elements in $P_{ls}D$,
we find that the multiplication of matrix $D$ and matrix $P_{ls}$
can be assumed to be commutative, i.e., $DP_{ls} \approx P_{ls}D$.
Therefore, we have

$$\pi((l+1)sT) \approx \pi(lsT)\, e^{-\Lambda_l sT} e^{\Lambda_l T \sum_{k=ls}^{(l+1)s-1} P_k}.$$

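Now, the upper bound of the error $\hat{\epsilon}_l$ is derived as follows.

$$\begin{aligned}
\hat{\epsilon}_l &= \left\| \hat{\pi}((l+1)sT) - \pi((l+1)sT) \right\|_1 \\
&= \left\| \hat{\pi}(lsT) \sum_{n=0}^{U_l} e^{-\Lambda_l sT} \frac{(\Lambda_l sT)^n}{n!} P_l^n - \pi(lsT)\, e^{-\Lambda_l sT} e^{\Lambda_l sT P_l} \right\|_1 \\
&= \left\| \left(\hat{\pi}(lsT) - \pi(lsT)\right) e^{-\Lambda_l sT} e^{\Lambda_l sT P_l} - \hat{\pi}(lsT) \sum_{n=U_l+1}^{\infty} e^{-\Lambda_l sT} \frac{(\Lambda_l sT)^n}{n!} P_l^n \right\|_1 \\
&\leq \left\| \hat{\pi}(lsT) - \pi(lsT) \right\|_1 e^{-\Lambda_l sT} e^{\Lambda_l sT \|P_l\|_\infty} + \left( 1 - \sum_{n=0}^{U_l} e^{-\Lambda_l sT} \frac{(\Lambda_l sT)^n}{n!} \right) \\
&= \hat{\epsilon}_{l-1} + \left( 1 - \sum_{n=0}^{U_l} e^{-\Lambda_l sT} \frac{(\Lambda_l sT)^n}{n!} \right).
\end{aligned}$$

The last equality comes from the fact that $\|P_l\|_\infty = 1$ as
$P_l = I + \frac{Q_l}{\Lambda_l}$, and $\hat{\epsilon}_{l-1} = \|\hat{\pi}(lsT) - \pi(lsT)\|_1$. Therefore,
we have the results stated in Theorem 1.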
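The per-interval term in such a bound is the Poisson probability mass beyond the truncation point, which is straightforward to evaluate. The sketch below is an illustration under assumptions: `x` stands for the product of the uniformization rate, s, and T, the function names are hypothetical, and the search for a truncation point is a plain linear scan suited to moderate `x` and tolerances well above floating-point precision.

```python
import math

def truncation_error(x, U):
    """Poisson(x) mass beyond U: 1 - sum_{n=0}^{U} e^{-x} x^n / n!."""
    term = math.exp(-x)        # n = 0 term
    covered = term
    for n in range(1, U + 1):
        term *= x / n          # e^{-x} x^n / n! from the previous term
        covered += term
    return max(0.0, 1.0 - covered)

def smallest_truncation_point(x, tol, max_U=1_000_000):
    """Smallest U whose neglected Poisson mass is at most tol."""
    U = 0
    while truncation_error(x, U) > tol and U < max_U:
        U += 1
    return U
```

For example, with `x = 1.0` the neglected mass is `1 - exp(-1)` for `U = 0` and `1 - 2 * exp(-1)` for `U = 1`.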



